61 research outputs found

Array Configuration-Agnostic Personal Voice Activity Detection Based on Spatial Coherence

Authors: Bai Mingsian R.; Hsu Yicheng
Publication date: 18/04/2023

Personal voice activity detection (PVAD) has received increased attention due to the growing popularity of personal mobile devices and smart speakers. PVAD is often an integral element of speech enhancement and recognition for these applications, in which lightweight signal processing is enabled only for the target user. However, in real-world scenarios, the detection performance may degrade because of competing speakers, background noise, and reverberation. To address this problem, we propose to use equivalent rectangular bandwidth (ERB)-scaled spatial coherence as the input feature to train an array configuration-agnostic PVAD (ARCA-PVAD) network. Although the network model requires only 112k parameters, it exhibits excellent detection performance and robustness in adverse acoustic conditions. Notably, the proposed ARCA-PVAD system is scalable to array configurations. Experimental results have demonstrated the superior performance achieved by the proposed ARCA-PVAD system over a baseline in terms of the area under the receiver operating characteristic curve and the equal error rate.

Comment: Accepted by INTER-NOISE 2023. arXiv admin note: text overlap with arXiv:2211.0874

arXiv.org e-Print Archive

A Wearable Indoor Navigation System for Blind and Visually Impaired Individuals

Author: Bai Yicheng
Publication date: 28/01/2015

Indoor positioning and navigation for blind and visually impaired individuals has become an active field of research. The development of a reliable positioning and navigation system will reduce the difficulties faced by people with visual disabilities, help them live more independently, and promote their employment opportunities. In this work, a coarse-to-fine multi-resolution model is proposed for indoor navigation in hallway environments based on the use of a wearable computer called the eButton. This self-constructed device contains multiple sensors which are used for indoor positioning and localization in three layers of resolution: a global positioning system (GPS) layer for building identification; a Wi-Fi and barometer layer for rough position localization; and a digital camera and motion sensor layer for precise localization. In this multi-resolution model, a new theoretical framework is developed which uses the change of atmospheric pressure to determine the floor number in a multistory building. To establish a database, the digital camera and motion sensors within the eButton acquire both pictorial and motion data as a person with normal vision walks along a hallway. Precise indoor positioning and localization information is provided to the visually impaired individual based on a Kalman filter fusion algorithm and an automatic matching algorithm between the acquired images and those in the pre-established database. Motion calculation based on the data from the motion sensors is used to refine the localization result. Experiments were conducted to evaluate the performance of the algorithms. Our results show that the new device and algorithms can precisely determine the floor level and indoor location along hallways in multistory buildings, providing a powerful and unobtrusive navigational tool for blind and visually impaired individuals.
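
The abstract above does not spell out how the change of atmospheric pressure is mapped to a floor number. As a rough illustration only, the sketch below uses the standard-atmosphere barometric formula rather than the author's framework; the function names, the 3.5 m floor height, and the ground-floor numbering are all assumptions made for this example.

```python
def pressure_to_altitude_m(pressure_hpa, sea_level_hpa=1013.25):
    """Approximate altitude from pressure using the standard-atmosphere formula.

    This is a generic textbook approximation, not the method of the cited work.
    """
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))


def estimate_floor(pressure_hpa, ground_pressure_hpa,
                   floor_height_m=3.5, ground_floor=1):
    """Estimate the floor number from the pressure change relative to a
    reference reading taken on the ground floor.

    floor_height_m and ground_floor are hypothetical building parameters.
    """
    height_gain_m = (pressure_to_altitude_m(pressure_hpa)
                     - pressure_to_altitude_m(ground_pressure_hpa))
    return ground_floor + round(height_gain_m / floor_height_m)


if __name__ == "__main__":
    # A drop of 1.25 hPa corresponds to roughly 10 m of height gain.
    print(estimate_floor(1012.0, 1013.25))
```

Using a pressure difference against a ground-floor reference, rather than an absolute altitude, keeps the estimate less sensitive to day-to-day weather changes, although the reference reading still has to be refreshed as the weather drifts.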

D-Scholarship@Pitt

Deep Beamforming for Speech Enhancement and Speaker Localization with an Array Response-Aware Loss Function

Authors: Bai Mingsian R.; Chang Hsinyu; Hsu Yicheng
Publication date: 22/10/2023

Recent research advances in deep neural network (DNN)-based beamformers have shown great promise for speech enhancement under adverse acoustic conditions. Different network architectures and input features have been explored in estimating beamforming weights. In this paper, we propose a deep beamformer based on an efficient convolutional recurrent network (CRN) trained with a novel ARray RespOnse-aWare (ARROW) loss function. The ARROW loss exploits the array responses of the target and interferer by using the ground truth relative transfer functions (RTFs). The DNN-based beamforming system, trained with ARROW loss through supervised learning, is able to perform speech enhancement and speaker localization jointly. Experimental results have shown that the proposed deep beamformer, trained with the linearly weighted scale-invariant source-to-noise ratio (SI-SNR) and ARROW loss functions, achieves superior performance in speech enhancement and speaker localization compared to two baselines.

Comment: 6 pages
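
For readers unfamiliar with the SI-SNR term mentioned above, the following is a minimal sketch of the commonly used definition: the estimate is projected onto the zero-mean reference, and the energy of that projection is compared with the energy of the residual. The function name and the `eps` stabilizer are assumptions for this example; the ARROW loss is not reproduced here because the abstract does not define it.

```python
import math


def si_snr_db(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences.

    Follows the widely used definition; the cited paper may differ in detail.
    """
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    n = len(reference)
    # Remove the mean so that a constant offset does not affect the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to get the target component.
    dot = sum(a * b for a, b in zip(est, ref))
    ref_energy = sum(b * b for b in ref) + eps
    scale = dot / ref_energy
    target = [scale * b for b in ref]
    # Everything not explained by the scaled reference counts as noise.
    noise = [a - t for a, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(v * v for v in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


if __name__ == "__main__":
    reference = [1.0, -1.0, 1.0, -1.0]
    estimate = [1.1, -0.9, 1.2, -1.0]
    print(round(si_snr_db(estimate, reference), 2))
```

Because of the projection step, rescaling the estimate leaves the score essentially unchanged, which is why the measure is popular as a training objective when the output gain of the enhancement system is arbitrary.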

arXiv.org e-Print Archive

The suppression of Finite Size Effect within a Few Lattices

Authors: Bai Kai; Chan C. T.; Lai Yun; Liu Tao; Wan Duanduan; Xiao Meng; Zhang Yicheng
Publication date: 28/04/2023

Boundary modes localized on the boundaries of a finite-size lattice experience a finite size effect (FSE) that could result in unwanted couplings, crosstalk, and the formation of gaps even in topological boundary modes. It is commonly believed that the FSE decays exponentially with the size of the system and thus requires many lattices before eventually becoming negligibly small. Here we identify a special type of FSE of some boundary modes that apparently vanishes at some particular wave vectors along the boundary. Moreover, the number of wave vectors where the FSE vanishes equals the number of lattices across the strip. We analytically demonstrate this type of FSE in a simple model and prove this peculiar feature. We also provide a physical system consisting of a plasmonic sphere array where this FSE is present. Our work points to the possibility of almost arbitrary tuning of the FSE, which facilitates unprecedented manipulation of the coupling strength between modes or channels, such as the integration of multiple waveguides and photonic non-abelian braiding.

Comment: 22 pages, 8 figures

arXiv.org e-Print Archive

Binary Star Evolution in Different Environments: Filamentary, Fractal, Halo and Tidal-tail Clusters

Authors: Bai Jing; Chen Wen-Ping; Chuang Rwei-ju; Feng Fabo; Kouwenhoven M. B. N.; Li Chengyuan; Pang Xiaoying; Rui Yicheng; Tang Shih-Yun; Wang Yifan
Publication date: 13/07/2023

Using membership of 85 open clusters from previous studies (Pang et al. 2021a,b, 2022b; Li et al. 2021) based on Gaia DR3 data, we identify binary candidates in the color-magnitude diagram, for systems with mass ratio q > 0.4. The binary fraction is corrected for incompleteness at different distances due to the Gaia angular resolution limit. We find a decreasing binary fraction with increasing cluster age, with substantial scatter. For clusters with a total mass > 200 $M_\odot$, the binary fraction is independent of cluster mass. The binary fraction depends strongly on stellar density. Among four types of cluster environments, the lowest-density filamentary and fractal stellar groups have the highest mean binary fraction: 23.6% and 23.2%, respectively. The mean binary fraction in tidal-tail clusters is 20.8%, and is lowest in the densest halo-type clusters: 14.8%. We find clear evidence of early disruptions of binary stars in the cluster sample. The radial binary fraction depends strongly on the cluster-centric distance across all four types of environments, with the smallest binary fraction within the half-mass radius $r_h$, and increasing towards a few $r_h$. Only hints of mass segregation are found in the target clusters. The observed amount of mass segregation is not significant enough to generate a global effect inside the target clusters. We evaluate the bias of unresolved binary systems (assuming a primary mass of 1 $M_\odot$) in 1D tangential velocity, which is 0.1-1 $\mathrm{km\,s^{-1}}$. Further studies are required to characterize the internal star cluster kinematics using Gaia proper motions.

arXiv.org e-Print Archive

Toward highly potent cancer agents by modulating the c-2 group of the arylthioindole class of tubulin polymerization inhibitors

Authors: Bai Ruoli; Brancale Andrea; Chen Feng; Coluccia Antonio; Costa Barbara; Da Pozzo Eleonora; Di Cesare Erica; Dondio Giulio; Famiglini Valeria; Granata Ilaria; Hamel Ernest; Iannitto Maria Luisa; La Regina Giuseppe; Lavia Patrizia; Li Junjie; Maresca Bruno; Martini Claudia; Mercurio Ciro; Miranda Cona Marlein; Nalli Marianna; Ni Yicheng; Novellino Ettore; Pelliccia Sveva; Piscitelli Francesco; Porta Amalia; Reggio Alessia; Rensen Whilelmina Maria; Santoni Angela; Silvestri Romano; Soriani Alessandra; Varasi Mario; Vultaggio Stefania
Publication venue: 'American Chemical Society (ACS)'
Publication date: 01/01/2013

New arylthioindole derivatives having different cyclic substituents at position 2 of the indole were synthesized as anticancer agents. Several compounds inhibited tubulin polymerization at submicromolar concentrations and inhibited cell growth at low nanomolar concentrations. Compounds 18 and 57 were superior to the previously synthesized 5. Compound 18 was exceptionally potent as an inhibitor of cell growth: it showed IC50 = 1.0 nM in MCF-7 cells, and it was uniformly active in the whole panel of cancer cells and superior to colchicine and combretastatin A-4. Compounds 18, 20, 55, and 57 were notably more potent than vinorelbine, vinblastine, and paclitaxel in the NCI/ADR-RES and Messa/Dx5 cell lines, which overexpress P-glycoprotein. Compounds 18 and 57 showed initial vascular disrupting effects in a tumor model of liver rhabdomyosarcomas at 15 mg/kg intravenous dosage. Derivative 18 showed water solubility and higher metabolic stability than 5 in human liver microsomes.

Archivio della ricerca - Università degli studi di Napoli Federico II

Online Research @ Cardiff

PubMed Central

Archivio della ricerca - Università di Roma La Sapienza